Ubiquitin is a 76-residue protein highly conserved among
eukaryotes. Conjugation of ubiquitin to intracellular proteins
mediates their selective degradation in vivo. We describe a
family of four ubiquitin-coding loci in the yeast Saccharomyces
cerevisiae. UBI1, UBI2 and UBI3 encode hybrid proteins in
which ubiquitin is fused to unrelated ('tail') amino acid
sequences. The ubiquitin-coding elements of UBI1 and UBI2
are interrupted at identical positions by non-homologous
introns. UBI1 and UBI2 encode identical 52-residue tails,
whereas UBI3 encodes a different 76-residue tail. The tail
amino acid sequences are highly conserved between yeast and
mammals. Each tail contains a putative metal-binding, nucleic
acid-binding domain of the form
Cys-X(2-4)-Cys-X(2-15)-Cys-X(2-4)-Cys, suggesting that these
proteins may function by binding to DNA. The fourth gene, UBI4,
encodes a polyubiquitin precursor protein containing five
ubiquitin repeats in a head-to-tail, spacerless arrangement.
All four ubiquitin genes are expressed in exponentially growing
cells, while in stationary-phase cells the expression of UBI1
and UBI2 is repressed. The UBI4 gene, which is strongly
inducible by starvation, high temperatures and other stresses,
contains in its upstream region strong homologies to the
consensus 'heat shock box' nucleotide sequence. Elsewhere we
show that the essential function of the UBI4 gene is to provide
ubiquitin to cells under stress.

Key words: ubiquitin/multigene family/protein turnover/TFIIIA
motif/yeast



Introduction 

Ubiquitin, a 76-residue protein, is apparently present in all
eukaryotic cells, and is extremely conserved in evolution, with
ubiquitin variants from animals, plants and yeast differing from
each other in three or fewer residues out of 76 (Schlesinger and
Goldstein, 1975; Gavilanes et al., 1982; Dworkin-Rastl et al.,
1984; Ozkaynak et al., 1984; Bond and Schlesinger, 1985;
Wiborg et al., 1985; Vierstra et al., 1986). Ubiquitin exists in
cells either free or covalently joined via its carboxyl-terminal
Gly residue to a variety of cytoplasmic, nuclear and cell surface
proteins (Busch and Goldknopf, 1981; Chin et al., 1982; Levinger
and Varshavsky, 1982; Hershko et al., 1983; Siegelman et al.,
1986; Yarden et al., 1986). Recent biochemical and genetic
evidence strongly suggests that conjugation of ubiquitin to
intracellular proteins is essential for their selective degradation
(reviewed by Finley and Varshavsky, 1985; Hershko and
Ciechanover, 1986; see also Bachmair et al., 1986; Finley et al.,
1987). The exceptional evolutionary conservation of the ubiquitin
amino acid sequence, and the close similarity between
ubiquitin-specific enzymes from yeast and mammals (S.Jentsch,
J.McGrath and A.Varshavsky, unpublished data) suggest that
insights gained from molecular genetic analyses of yeast ubiquitin
should be relevant to other eukaryotes as well. We have
previously reported isolation from the yeast Saccharomyces
cerevisiae of a ubiquitin-coding gene that was found to encode a
polyubiquitin precursor protein (Ozkaynak et al., 1984). Recently,
several groups have reported the cloning of ubiquitin-coding DNA
sequences from a variety of eukaryotes. Polyubiquitin genes have
been identified in all five eukaryotic species examined, from yeast
to mammals (Dworkin-Rastl et al., 1984; Ozkaynak et al., 1984;
Bond and Schlesinger, 1985, 1986; Wiborg et al., 1985; St. John
et al., 1986).

In the present work we describe the family of ubiquitin-coding 
loci in S. cerevisiae. One remarkable feature of this family is 
that all four of the yeast ubiquitin genes encode ubiquitin fusions, 
either to itself (the polyubiquitin gene) or to unrelated amino acid 
sequences. We have also found (see Finley et al., 1987), through
the analysis of yeast strains with an engineered deletion of the 
polyubiquitin gene, that this gene is specifically required for 
resistance of cells to stress. 

Results 

Multiple ubiquitin-coding genes in yeast 
We have previously cloned and partially sequenced a gene that
encodes the yeast polyubiquitin precursor protein (Ozkaynak et
al., 1984). A low-stringency Southern hybridization of
electrophoretically fractionated genomic DNA with the polyubiquitin
DNA probe suggests the presence of additional ubiquitin-coding
loci in S. cerevisiae (Figure 1A, lanes a,b). The major
hybridizing band in either HindIII or HindIII/EcoRI digests of
genomic DNA corresponds to the polyubiquitin gene, whereas the
minor, ~1.8-kb and ~3.5-kb bands in Figure 1A, lane b are not
accounted for by this locus, suggesting that the minor bands
correspond to additional ubiquitin-coding genes. Genomic DNA in
the regions of the ~1.8-kb and ~3.5-kb bands (Figure 1A, lane d)
was purified by gel electrophoresis and cloned into an M13 vector.
DNA clones containing ubiquitin-specific inserts were then
isolated and sequenced (see Materials and methods). This approach
yielded three new ubiquitin-coding genes, UBI1, UBI2 and UBI3
(Figures 2-4). UBI4 designates the previously identified
polyubiquitin gene (Figure 5). One striking feature of these genes
is that while all of them contain ubiquitin-coding sequences, none
of them encodes mature ubiquitin (Figure 6). Although the
UBI1-UBI4 genes encode identical amino acid sequences of
ubiquitin (Figures 2-5), they differ significantly at the
nucleotide sequence level (see below). Low-stringency Southern
hybridization of genomic DNA with probes spanning each of the four
ubiquitin-coding loci does not reveal any cross-hybridizing
sequences that cannot be accounted for by the already identified
ubiquitin genes (Figure 1, panels B-E, and data not shown). Thus
it is likely that we have identified all of the ubiquitin-coding
genes in S. cerevisiae.
Fig. 1. Ubiquitin-coding loci in S. cerevisiae. Hybridizations with cloned DNA probes were carried out as described in Materials and methods. DNA in all of
the lanes except c and d is from strain S288C (Sherman et al., 1981). DNA in lanes c and d is from the ubi4-Δ strains SUB16 and SUB17, respectively, in
which the UBI4 gene had been deleted (see Finley et al., 1987 for a detailed description of these strains). Lane a, HindIII digest; lanes b-d, HindIII/EcoRI
double digest; lanes e, g, i and k, SspI digest; lanes f, h, j and l, HaeIII/HincII double digest. Panels A and E: hybridization with a ~1.3-kb BstXI/BclI
fragment of the UBI4 gene (Figure 5) that contains only ubiquitin-coding repeats. Panel B: hybridization with a 0.46-kb EcoRV/AccI fragment of the UBI1
gene (Figure 2) that contains both the ubiquitin- and tail-coding sequences of UBI1. Panel C: hybridization with a 0.43-kb XmnI/AccI fragment of the UBI2
gene (Figure 3) that contains both the ubiquitin- and tail-coding sequences of UBI2. Panel D: hybridization with a 0.57-kb SspI/HincII fragment of the UBI3
gene that contains both the ubiquitin- and tail-coding sequences of UBI3. The indicated DNA sizes of marker DNA bands (not shown) are in kilobase pairs.
Arrowheads indicate the ~1.8-kb and ~3.5-kb bands in lanes b-d that correspond to UBI1-UBI3 loci (see Materials and methods and main text). Weak
bands at ~1.02 kb in lane g and ~0.75 kb in lane h, that cannot be accounted for by the UBI1-UBI4 genes, have been found in additional experiments to
be due to an impurity in the UBI2-specific probe used (data not shown).



UBI1 contains an intron and encodes ubiquitin fused to an
unrelated amino acid sequence

The nucleotide sequence of the UBI1 gene is shown in Figure 2.
The amino acid sequence of the ubiquitin portion of the fusion
protein encoded by UBI1 is identical to the amino acid sequences
of ubiquitin encoded by UBI2, UBI3 and by all five of the
ubiquitin-coding repeats of UBI4 (Figures 2-5). At the same
time, there are significant differences at the nucleotide sequence
level between the ubiquitin-coding elements of UBI1-UBI4. For
instance, the ubiquitin-coding portions of UBI1 and UBI3 (Figures
2 and 4) differ at ~8% of the base positions, with most of the
differences confined to the third positions in the corresponding
codons. Moreover, the ubiquitin-coding element of UBI1, in
contrast to those of UBI3 and UBI4, is interrupted within the third
codon by a 434-bp intron (Figure 2). The splice junctions of the
UBI1 intron are typical of those in yeast introns, with only one
nucleotide at the 5' junction, marked by a small dot in Figure
2, diverging from the strongly conserved consensus sequence of
this site in S. cerevisiae (Langford and Gallwitz, 1983; Teem
and Rosbash, 1983; Leer et al., 1984; Green, 1986;
Vijayraghavan et al., 1986). The TACTAAC 'box', essential for
splicing in S. cerevisiae, is also present within the intron
(Figure 2). The UBI1 intron has no other obvious homologies to
either previously identified yeast introns or to the intron of the
UBI2 gene, which encodes an identical fusion protein (Figures 2
and 3). While most of the genes for yeast ribosomal proteins
contain introns, few other yeast genes do (Leer et al., 1984; Teem
et al., 1984; Green, 1986). Both the scarcity of yeast introns and
their non-random distribution between genes of different
functional classes suggest that some yeast introns serve a
regulatory function. In one case this expectation has been
confirmed (Dabeva et al., 1986). No evidence on the possible
function of introns in the UBI1 and UBI2 genes is currently
available.
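
The observation that nucleotide differences between ubiquitin-coding elements fall mostly at third codon positions, leaving the encoded protein unchanged, can be illustrated with a minimal sketch. The two fragments below are short illustrative codings of the first 11 ubiquitin residues chosen for this example (an assumption, not the full sequences of Figures 2-5), and the helper names `translate` and `differences_by_codon_position` are hypothetical.

```python
# Illustrative fragments only (assumption): two synonymous codings of
# Met-Gln-Ile-Phe-Val-Lys-Thr-Leu-Thr-Gly-Lys, not the full UBI sequences.
SEQ_A = "ATG CAA ATT TTC GTC AAG ACT TTA ACC GGT AAG".replace(" ", "")
SEQ_B = "ATG CAG ATT TTC GTC AAG ACT TTG ACC GGT AAA".replace(" ", "")

# Minimal codon table covering only the codons used in the fragments above.
CODON_TABLE = {
    "ATG": "M", "CAA": "Q", "CAG": "Q", "ATT": "I", "TTC": "F",
    "GTC": "V", "AAG": "K", "AAA": "K", "ACT": "T", "ACC": "T",
    "TTA": "L", "TTG": "L", "GGT": "G",
}


def translate(seq):
    """Translate an in-frame coding sequence using CODON_TABLE."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def differences_by_codon_position(a, b):
    """Count mismatches at codon positions 1, 2 and 3 of two aligned ORFs."""
    if len(a) != len(b) or len(a) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    counts = [0, 0, 0]
    for index, (x, y) in enumerate(zip(a, b)):
        if x != y:
            counts[index % 3] += 1
    return counts


counts = differences_by_codon_position(SEQ_A, SEQ_B)
percent_divergence = 100 * sum(counts) / len(SEQ_A)

print("protein A:", translate(SEQ_A))
print("protein B:", translate(SEQ_B))
print("differences at codon positions 1/2/3:", counts)
print(f"nucleotide divergence: {percent_divergence:.1f}%")
```

In this toy case all three differences sit at third positions and both fragments translate to the same peptide; the same tally applied to the full coding sequences would be the way to reproduce the divergence figures quoted in the text.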

The fusion protein encoded by the UBI1 gene is 128 residues
long and consists of ubiquitin followed by an unrelated 52-residue
sequence (Figure 2). Although there is no direct evidence that
the fusion protein is efficiently cleaved in vivo between residues
76 and 77 (Figure 2) to yield mature ubiquitin, recent insights
into substrate requirements of the apparently ubiquitin-specific
processing protease (Bachmair et al., 1986) suggest that such
cleavage does occur. In particular, it has been found that a yeast
protease that cleaves the engineered ubiquitin-β-galactosidase
fusion proteins in vivo at the ubiquitin-β-galactosidase junction
is generally insensitive to the nature of the first residue of
β-galactosidase at the junction (Bachmair et al., 1986).

Fig. 2. Nucleotide sequence and deduced amino acid sequence of the UBI1 locus. The TACTAAC box within the intron and the conserved intron sequences
abutting the 5' and 3' splice junctions are marked by large dots. A nucleotide residue that does not match the consensus sequence for the 5' splice junction
(Leer et al., 1984) is marked by a small dot. An arrowhead indicates the site of proteolytic cleavage that would be required to generate mature ubiquitin from
the primary translation product. A stretch rich in basic amino acid residues that resembles a nuclear localization signal (Dingwall and Laskey, 1986) is
underlined in the tail sequence. Wavy lines mark a bipartite nucleotide sequence motif (a CT-rich block followed, a variable distance downstream, by the
sequence CAAG) common to the upstream regions of many but not all yeast genes; its function is unknown (Dobson et al., 1982). A nucleotide sequence
motif that resembles the consensus sequence for transcription termination in yeast (Zaret and Sherman, 1982) is marked by small dots. We have also
examined the coding and flanking regions of the UBI1-UBI4 genes for the presence of several other known regulatory motifs, such as sequences present
upstream of the genes for ribosomal proteins (Teem et al., 1984; Woudt et al., 1986), sequences responsible for the cell cycle-specific expression of certain
yeast genes (Nasmyth, 1985), and the known recognition sequences for regulatory proteins of the yeast mating system (Miller et al., 1985; Russell et al.,
1986). No significant homologies to any of the above sequences have been detected in UBI1-UBI4.

Fig. 3. Nucleotide sequence and deduced amino acid sequence of the UBI2 locus. A stretch of 14 Ts in the upstream region (see Struhl, 1985) is underlined.
Other designations are as in Figure 2.

The 52-residue tail of the UBI1 fusion protein is basic,
containing 31% lysine and arginine residues. A highly basic stretch
of seven residues at its carboxyl terminus (Figure 2) strongly
resembles a sequence motif shown previously to be required for
protein localization to the nucleus (reviewed by Dingwall and
Laskey, 1986). These data suggest that either the intact fusion
protein or its 52-residue tail may function as a nucleic acid
binding protein. This possibility is strengthened by the presence
within the tail of a cysteine-containing sequence motif originally
identified within a 5S RNA gene-specific transcription factor,
TFIIIA (Miller et al., 1985), and subsequently found in a number
of other nucleic acid binding proteins (see below).
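
The composition figures quoted for the UBI1 tail can be checked with a short calculation. The sketch below assumes the 52-residue tail sequence as transcribed from the deduced amino acid sequence in Figures 2 and 3; the transcription and the helper name `basic_fraction` are assumptions of this example and should be verified against the figures before reuse.

```python
# Assumption: tail sequence transcribed from the deduced amino acid
# sequence of the UBI1/UBI2 fusion protein (Figures 2 and 3).
UBIQUITIN_LENGTH = 76
UBI1_TAIL = "IIEPSLKALASKYNCDKSVCRKCYARLPPRATNCRKRKCGHTNQLRPKKKLK"


def basic_fraction(seq):
    """Fraction of residues that are lysine (K) or arginine (R)."""
    return sum(1 for residue in seq if residue in "KR") / len(seq)


fusion_length = UBIQUITIN_LENGTH + len(UBI1_TAIL)
basic_percent = 100 * basic_fraction(UBI1_TAIL)
c_terminal_seven = UBI1_TAIL[-7:]

print("tail length:", len(UBI1_TAIL))
print("fusion protein length:", fusion_length)
print(f"Lys + Arg content of tail: {basic_percent:.1f}%")
print("last seven residues:", c_terminal_seven)
print("basic residues among them:",
      sum(1 for residue in c_terminal_seven if residue in "KR"))
```

With this transcription the tail is 52 residues long, the fusion protein is 128 residues long, and lysine plus arginine make up about 31% of the tail, in agreement with the values given in the text.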

UBI2 contains an intron and encodes a ubiquitin fusion protein
identical to that encoded by UBI1

The nucleotide sequence of the UBI2 gene is shown in Figure 3.
Despite the ~15% divergence at the nucleotide sequence level
between the coding regions of UBI1 and UBI2, they encode
identical fusion proteins (Figures 2 and 3). Although the 367-bp
UBI2 intron interrupts the ubiquitin-coding sequence of UBI2 at
exactly the same position as in UBI1, the two introns differ in
size and are not obviously homologous except for the essential
sequence elements that are generally conserved among yeast
introns (Figures 2, 3 and 6). The flanking regions of the UBI1 and
UBI2 genes are also largely non-homologous, suggesting that
UBI1 and UBI2, while encoding identical proteins, may be
differentially regulated in vivo.

Fig. 4. Nucleotide sequence and deduced amino acid sequence of the UBI3 locus. Designations are as in Figures 2 and 3.

[Fig. 5: nucleotide sequence and deduced amino acid sequence of the UBI4 locus.]

UBI3 encodes a ubiquitin fusion protein distinct from that
encoded by UBI1 and UBI2

The nucleotide sequence of the UBI3 gene is shown in Figure 4.
In contrast to the UBI1 and UBI2 genes, the coding region
of UBI3 lacks an intron (Figure 4). While the amino acid
sequence of ubiquitin encoded by UBI3 is identical to those
encoded by UBI1, UBI2 and UBI4, the amino acid sequence of
the 76-residue UBI3 tail is quite different from the 52-residue
tail of the UBI1 and UBI2 proteins (Figures 2-4). At the same
time, the tails of both UBI1/UBI2 and UBI3 fusion proteins are
basic, containing 31% and 29% lysine/arginine residues,
respectively. In addition, both tails contain putative nuclear
localization sequences, although in the UBI3 protein this sequence
is present at the beginning of the tail rather than at its end as in
the UBI1 and UBI2 proteins (Figures 2 and 3). Moreover, a
TFIIIA-like, putative nucleic acid-binding motif present in the
UBI1 and UBI2 tails is also found in the UBI3 tail. As in the
case of the UBI1 and UBI2 fusion proteins (see above), it is likely
but not certain that the UBI3 fusion protein is cleaved in vivo
between residues 76 and 77 to yield mature ubiquitin and a free
tail protein.
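
The cysteine motif Cys-X(2-4)-Cys-X(2-15)-Cys-X(2-4)-Cys described for the tails can be expressed as a regular expression and searched for directly. The sketch below assumes the two tail sequences as transcribed from the deduced amino acid sequences in Figures 2-4 (an assumption to be verified against the figures); it uses only Python's standard `re` module, and the names `MOTIF` and `TAILS` are introduced here for illustration.

```python
import re

# Cys-X(2-4)-Cys-X(2-15)-Cys-X(2-4)-Cys, with X standing for any residue.
MOTIF = re.compile(r"C.{2,4}C.{2,15}C.{2,4}C")

# Assumption: tail sequences transcribed from Figures 2-4.
TAILS = {
    "UBI1/UBI2": "IIEPSLKALASKYNCDKSVCRKCYARLPPRATNCRKRKCGHTNQLRPKKKLK",
    "UBI3": ("GKKRKKKVYTTPKKIKHKHKKVKLAVLSYYKVDAEGKVTK"
             "LRRECSNPTCGAGVFLANHKDRLYCGKCHSVYKVNA"),
}

matches = {}
for name, tail in TAILS.items():
    found = MOTIF.search(tail)
    matches[name] = found
    basic = 100 * sum(1 for residue in tail if residue in "KR") / len(tail)
    if found:
        # Report 1-based positions within the tail.
        print(f"{name}: {len(tail)} residues, {basic:.0f}% Lys+Arg, "
              f"motif at {found.start() + 1}-{found.end()}: {found.group()}")
    else:
        print(f"{name}: {len(tail)} residues, {basic:.0f}% Lys+Arg, no motif")
```

A single regular-expression search reports only the leftmost match; where a tail contains more than four cysteines, as the UBI1/UBI2 tail does, other spacings that also satisfy the pattern are not listed.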

UBI4 encodes a polyubiquitin fusion protein
The complete nucleotide sequence of the UBI4 gene, determined
in part previously (Ozkaynak et al., 1984), is shown in Figure
5. The UBI4 gene encodes a ~43-kd primary translation product
composed of five identical repeats of the ubiquitin amino
acid sequence joined head-to-tail, without spacers (Figure 5). To
generate mature ubiquitin from this putative precursor protein,
a processing protease would have to cleave at Gly-Met junctions
between the adjacent repeats (Figure 5). Although the
polyubiquitin-processing protease remains to be characterized
biochemically, a protease that de-ubiquitinates
ubiquitin-β-galactosidase fusion proteins in vivo has recently been
identified by Bachmair et al. (1986). The substrate requirements of
this enzyme are those expected of a protease that processes both
polyubiquitin and the other natural ubiquitin fusions encoded by
the UBI1, UBI2 and UBI3 genes.

Remarkably, the last of the five ubiquitin repeats in the UBI4
protein is followed by a single Asn residue (Figures 5 and 6).
Most polyubiquitin precursor proteins in higher eukaryotic species
also contain an extra amino acid residue at the end of their last
ubiquitin repeat, the extra residue being different in polyubiquitins
from different species (reviewed by Finley and Varshavsky,
1985). By blocking the carboxyl terminus of the preceding Gly
residue, the extra carboxyl-terminal residue could serve to
prevent participation of unprocessed polyubiquitin in
ubiquitin-protein conjugation. The properties of a recently
characterized ubiquitin-specific hydrolase from mammalian cells
that cleaves small adducts off the carboxyl terminus of ubiquitin
(Pickart and Rose, 1986) are compatible with its involvement in
removing this extra residue. Both the function of the extra residue
and the identity of the protease that is responsible for its removal
in vivo can now be addressed directly using the methods of yeast
molecular genetics.
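
The processing steps implied above (cleavage at each Gly-Met junction, followed by removal of the extra carboxyl-terminal Asn) can be sketched on the deduced UBI4 product. The ubiquitin sequence below is transcribed from the deduced amino acid sequences in the figures, and the function name `process_polyubiquitin` is hypothetical; this models only the expected products, not the enzymology.

```python
# Assumption: yeast ubiquitin sequence transcribed from Figures 2-5.
UBIQUITIN = ("MQIFVKTLTGKTITLEVESSDTIDNVKSKIQDKEGIPPDQ"
             "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG")
N_REPEATS = 5
EXTRA_RESIDUE = "N"  # single Asn following the last repeat

# Head-to-tail, spacerless precursor as described for UBI4.
precursor = UBIQUITIN * N_REPEATS + EXTRA_RESIDUE


def process_polyubiquitin(protein):
    """Cleave at Gly-Gly|Met junctions, then trim the C-terminal extension."""
    pieces = protein.replace("GGM", "GG|M").split("|")
    last = pieces[-1]
    extension = last[len(UBIQUITIN):]
    pieces[-1] = last[:len(UBIQUITIN)]
    return pieces, extension


monomers, extension = process_polyubiquitin(precursor)

print("ubiquitin length:", len(UBIQUITIN))
print("precursor length:", len(precursor))
print("monomers released:", len(monomers))
print("all monomers identical to ubiquitin:",
      all(monomer == UBIQUITIN for monomer in monomers))
print("C-terminal extension removed:", extension)
```

The precursor built this way is 381 residues long (5 x 76 + 1), which matches the residue numbering at the final Asn in Figure 5; at a rule-of-thumb average of about 110 daltons per residue (an assumption), that is roughly 42 kd, close to the ~43 kd stated in the text.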

The polyubiquitin gene of S. cerevisiae was previously isolated
from a λgt11-based library (Ozkaynak et al., 1984). The UBI4
gene cloned in the present work (Figure 5) was isolated from
an independently obtained, plasmid-based library. That the UBI4
gene cloned earlier (Ozkaynak et al., 1984) contains six
ubiquitin-coding repeats, in contrast to five such repeats in the
UBI4 gene of the present work (Figure 5), could be due either to
instability of the number of ubiquitin-coding repeats upon
propagation of the libraries or to a natural variation in the number
of repeats from strain to strain within the same species of yeast.

Other aspects of the UBI1-UBI4 genes 
A striking feature of the 5' flanking region of UBI4 is the presence 
of an 18-bp, rotationally symmetric sequence 365 bp upstream 
of the first codon of UBI4 (Figure 5). The middle 14 bases of 
this 18-bp sequence (Figure 5) contain an exact homology to the 
rotationally symmetric consensus 'heat shock box' nucleotide 
sequence that has previously been shown to confer stress inducibility 
when placed upstream of heterologous promoters (Pelham, 1982; 
Parker and Topol, 1984; Wu, 1984; Shuey and Parker, 1985). 
The upstream regions of UBI4 and UBI3 also contain several 
weaker heat shock box homologies whose statistical significance 
is marginal. The presence of the heat shock box homologies in 
the 5' flanking region of the UBI4 gene is consistent with the 
stress inducibility of UBI4 (see below), and with its specific 
function of providing ubiquitin to cells under stress (Finley et al., 
1987). 
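
A sequence is rotationally symmetric when it equals its own reverse complement, and a search for heat shock box homologies amounts to scanning for a degenerate consensus. The Python sketch below shows both checks. The 14-bp consensus used here, CTNGAANNTTCNAG, is an assumption taken from the general heat shock literature rather than from this text, and the toy sequence and function names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_rotationally_symmetric(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


# Assumed 14-bp heat shock box consensus: CTNGAANNTTCNAG (N = any base).
# The lookahead lets overlapping matches be reported as well.
HEAT_SHOCK_BOX = re.compile(r"(?=(CT[ACGT]GAA[ACGT]{2}TTC[ACGT]AG))")


def find_heat_shock_boxes(seq):
    """Return (0-based start, matched 14-mer) for every consensus match."""
    return [(m.start(), m.group(1)) for m in HEAT_SHOCK_BOX.finditer(seq.upper())]


# Hypothetical toy sequence, not the UBI4 upstream region.
toy_upstream = "GGAT" + "CTAGAATATTCTAG" + "TTAC"
print(find_heat_shock_boxes(toy_upstream))
print(is_rotationally_symmetric("CTAGAATATTCTAG"))
```

An exact-match scan of this kind only recovers full consensus copies; the weaker, statistically marginal homologies mentioned above would require a mismatch-tolerant search.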

The upstream regions of UBI2, UBI3 and UBI4 (but not of 
UBI1) also contain stretches of poly(dA)-poly(dT) (underlined 
in Figures 3-5). Such sequences can function as upstream 
promoter elements for constitutive transcription in yeast (Struhl, 
1985). While several putative TATA boxes (Hahn et al., 1985; 
Nagawa and Fink, 1985; McNeil and Smith, 1986) are present 
upstream of the UBI2, UBI3 and UBI4 genes (Figures 3-5), 
there are no obvious TATA box homologies upstream of the UBI1 
gene (Figure 2). 

Differential expression of UBI1-UBI4 genes 
The results of Northern hybridization analysis of UBI1-UBI4 
expression in stationary-phase and exponentially growing yeast 
cultures, using probes specific for each of the four genes, are 
shown in Figure 7. The bulk of the UBI1-UBI4 mRNAs are 
apparently polyadenylated, since they are retained on an 
oligo(dT)-cellulose column (data not shown). All four genes are 
expressed in exponentially growing cells, but expression of UBI1 
and UBI2 is virtually undetectable in stationary-phase cells (Figure 
7). UBI3 is transcribed in both growing and stationary-phase cells 
(Figure 7, lanes e,f; note the change in size distribution of the 
UBI3 mRNA species between growing and stationary-phase 
cells). In striking contrast, UBI4 expression is much greater in 
stationary-phase cells than in growing cells (Figure 7, lanes g,h). 
The stress-specific expression of the UBI4 gene accounts at least 
in part for the stress-specific phenotype of mutants that lack UBI4 
(Finley et al., 1987). 

The tails of UBI1-UBI3 proteins are conserved between yeast 
and higher eukaryotes 

The amino acid sequence of ubiquitin is identical in mammals, 
frogs, fish and insects, and differs in only three of 76 residues 
from the sequence of yeast ubiquitin (Figure 8 and Ozkaynak 
et al., 1984). Ubiquitin from plants is even more homologous 
to yeast ubiquitin (two differences over 76 residues) (Vierstra 
et al., 1986). Remarkably, the tails of the UBI1-UBI3 proteins 
are also strongly conserved in evolution. Specifically, the deduced 
amino acid sequence of the tail of the UBI3 protein is ~67% 
homologous to the deduced sequence of its putative human 
counterpart (Figure 8 and Lund et al., 1985). Only a partial 
deduced amino acid sequence of a putative mouse counterpart 



Fig. 5. Nucleotide sequence and deduced amino acid sequence of the UBI4 locus. The underlined upstream sequence at bases 384-402 is a rotationally 
symmetric 18-bp sequence with an exact copy of the 14-bp consensus heat shock box sequence (Pelham, 1982; Shuey and Parker, 1985) at its center (doubly 
underlined). Weaker, statistically marginal homologies to the heat shock box occur also at positions 76-89, 360-367 and 721-731. The 'non-ubiquitin' Asn 
residue at the end of the polyubiquitin precursor protein is boxed. Other designations are as in Figures 2 and 3. 
Fig. 6. Organization of yeast UBI1-UBI4 genes. The relative sizes of 
coding elements and introns in UBI1 and UBI2 are drawn approximately to 
scale. Open blocks denote the 228-bp ubiquitin-coding elements. Two 
striped blocks and a dark block denote tail-coding elements in UBI1, UBI2 
and UBI3 respectively. The tail amino acid sequence in the polyubiquitin 
product of UBI4 consists of a single residue, Asn. See Figures 2-5 for 
details. 



of the UBI1 and UBI2 tail sequence is known at present (Figure 
8 and St. John et al., 1986). In this case as well, the degree of 
homology in the region currently available for comparison is very 
high (~74%). The tail amino acid sequences are thus conserved 
to a remarkably high degree over great evolutionary distances, 
and therefore are likely to have specific if not essential functions. 
A search for similarities between the sequences of the 
UBI1-UBI3 tails and known proteins, using the National 
Biomedical Research Foundation database and the algorithm of 
Lipman and Pearson (1985), has not revealed statistically significant 
homologies (data not shown). 
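
Percentages such as ~67% and ~74% are obtained by counting identical residues over aligned positions. The Python sketch below shows one common way to compute such a figure from two pre-aligned sequences. It is an illustration only: the text does not state which denominator was used, the choice to skip gapped columns is an assumption, and the toy sequences are hypothetical rather than the actual tail sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, ungapped positions.

    Both inputs must already be aligned and of equal length. Columns in
    which either sequence has a gap are skipped (an assumption; other
    conventions divide by the full alignment length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment: 9 ungapped columns, 8 of them identical.
toy_a = "ACDEFGHIKL"
toy_b = "ACDQFGH-KL"
print(round(percent_identity(toy_a, toy_b), 1))
```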

The UBI1-UBI3 tails contain putative metal-binding, nucleic 
acid-binding domains 

Recent findings by Miller et al. (1985) indicate that the 
recognition by transcription factor TFIIIA of an internal control region 
in the Xenopus 5S RNA gene is mediated by nine quasi-repeats 
of a ~30-residue domain that contains a coordinated Zn2+ ion 
and has several specifically arranged Cys and His residues per 
domain. The authors proposed that the repeating units in TFIIIA 
represent distinct Zn2+-binding domains that interact with 
nucleic acids. Each of the nine domains in TFIIIA is expected 
to bind ~5 bp of DNA (Rhodes and Klug, 1986). A subsequent 
search for homologous sequences in other proteins has revealed 
that potential metal-binding domains occur in a number of 
proteins that have been implicated in nucleic acid binding (Berg, 
1986; Harrison, 1986). As shown in Figure 9, the UBI1, UBI2 
and UBI3 tails each contain a sequence that matches a generalized 
(Berg, 1986) consensus sequence of the TFIIIA motif. 
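
A spacing pattern of the form Cys-X2-4-Cys-X2-15-Cys-X2-4-Cys, as given in the abstract for the tail domains, maps directly onto a regular expression. The Python sketch below scans a protein sequence for that pattern. It is a minimal illustration: it covers only the all-Cys form and not variants in which His replaces Cys, and the toy sequence and function name are hypothetical.

```python
import re

# Cys-X(2-4)-Cys-X(2-15)-Cys-X(2-4)-Cys, where X is any residue.
METAL_BINDING_MOTIF = re.compile(r"C.{2,4}C.{2,15}C.{2,4}C")


def find_metal_binding_motif(protein):
    """Return (0-based start, matched segment) of the first match, or None."""
    match = METAL_BINDING_MOTIF.search(protein.upper())
    if match is None:
        return None
    return match.start(), match.group(0)


# Hypothetical toy sequence built so that the Cys spacing is 3, 7 and 3.
toy_tail = "MKKAA" + "C" + "AAK" + "C" + "AAAAAAA" + "C" + "AAA" + "C" + "KK"
print(find_metal_binding_motif(toy_tail))
```

A match of this kind shows only that the Cys spacing is compatible with the consensus; it says nothing about whether the segment actually binds a metal ion or nucleic acid.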

Discussion 

The results of this work indicate that the ubiquitin-coding 
sequences of yeast comprise a family of natural gene fusions. Thus, 
in S. cerevisiae ubiquitin is invariably a product of 
post-translational processing of precursor proteins in which ubiquitin 
is joined either to itself, as in the polyubiquitin (UBI4) protein, 
or to unrelated amino acid sequences, as in the other ubiquitin 
fusion proteins, UBI1-UBI3. As shown elsewhere (Finley et 
al., 1987), at least one ubiquitin-coding locus, UBI4 (the 
polyubiquitin gene), is not essential for viability of growing, unstressed 
vegetative cells. It is, however, absolutely required for resistance 
of cells to high temperatures, starvation and other stresses (Finley 
Fig. 7. Expression of UBI1-UBI4 in growing and stationary cells. Total 
yeast RNA was electrophoresed, blotted and hybridized to DNA probes as 
described in Materials and methods. Equal amounts of total RNA were 
applied onto each lane. The same filter was used for sequential 
hybridizations with labeled DNA probes specific for either UBI1 (lanes a, 
b), UBI2 (lanes c, d), UBI3 (lanes e, f) or UBI4 (lanes g, h). The 
UBI1-specific probe was a 0.34-kb HindIII/HincII fragment. The 
UBI2-specific probe was the 0.34-kb StyI/HincII fragment. The 
UBI3-specific probe was a 0.51-kb TaqI fragment. These three probes 
contain 3' flanking and tail-coding sequences of UBI1, UBI2 and UBI3, 
respectively. The UBI4-derived probe (a BstXI/BclI fragment) also 
cross-hybridized to ubiquitin-coding elements in the non-UBI4 RNAs. Labels at 
the tops of the lanes denote RNA isolated from exponentially growing (exp.) 
and stationary (stat.) cultures, respectively (see Materials and methods). The 
indicated RNA sizes are in kilobases. 

et al., 1987). The requirement for UBI4 function in ubi4 deletion 
mutants can be satisfied by an in vitro-constructed UBI4 
minigene that contains the flanking sequences of UBI4 but only 
a single ubiquitin-coding repeat, indicating that the required 
function of UBI4 is to provide mature ubiquitin monomers rather than 
polyubiquitin per se (Finley et al., 1987). Since the polyubiquitin 
gene functions specifically during stress (Finley et al., 1987), 
its repetitive structure may provide an additional selective 
advantage in reducing the metabolic cost of ubiquitin synthesis under 
conditions in which metabolic activities are severely compromised. 
In addition, formation of the spacerless polyubiquitin gene 
(presumably from a monoubiquitin gene) in the course of evolution 
may have been greatly facilitated by the prior existence of 
ubiquitin carboxyl-terminal proteases (Finley and Varshavsky, 
1985; Bachmair et al., 1986; Kanda et al., 1986; Pickart and 
Rose, 1986). Such proteases, having evolved presumably to 






[Figure 8 alignment panels: yeast ubiquitin versus mammalian ubiquitin; yeast UBI1, UBI2 tail versus mouse; yeast UBI3 tail versus human] 



Fig. 8. Conservation of the deduced amino acid sequences of the tails of UBI1-UBI3 proteins between yeast and mammals. The deduced amino acid 
sequences of yeast ubiquitin and of the tails encoded by UBI1-UBI3 are taken from Figures 2-5. The deduced amino acid sequence of a human homolog of 
the UBI3 protein is from Lund et al. (1985). A portion of the mouse gene which appears to encode a counterpart of the yeast UBI1 (UBI2) protein has 
recently been isolated by St. John et al. (1986). Shown here is the currently known portion of the tail amino acid sequence encoded by the putative mouse 
homolog of the UBI1, UBI2 genes. 
Fig. 9. Putative metal-binding, nucleic acid-binding domains in UBI1-UBI3 proteins. The consensus sequence shown (Berg, 1986), in which C and H stand 
for cysteine and histidine, respectively, is a generalized version of the motif originally found in the transcription factor TFIIIA of Xenopus and proposed to 
represent a Zn2+-binding domain that interacts with nucleic acids (Miller et al., 1985). Homologous motifs have since been detected in many other nucleic 
acid-binding and metal-binding proteins (reviewed by Berg, 1986; Vincent, 1986; Harrison, 1986). 

recycle ubiquitin from post-translationally formed 
ubiquitin-protein conjugates, were thus 'pre-adapted' to process 
polyubiquitin and other ubiquitin fusion proteins. 

Since free ubiquitin is found (at normal levels) in unstressed 
cells with an engineered deletion of the UBI4 gene (Finley et 
al., 1987), at least one of the UBI1-UBI3 proteins is expected 
to be processed in vivo to yield mature ubiquitin. These findings, 
and the expression patterns of the UBI1-UBI4 genes (see 
Results), suggest that mature ubiquitin is derived largely from at 
least one of the UBI1-UBI3 fusion proteins in unstressed cells, 
whereas in stressed wild-type cells it is derived largely from the 
polyubiquitin precursor protein, UBI4. The stress-specific 
expression of the UBI4 gene accounts at least in part for the 
stress-specific phenotype of mutants that lack UBI4 (see Finley et al., 
1987 for further discussion of the function of UBI4). 

Possible functions of the UBI1-UBI3 proteins 
The genes UBI1, UBI2 and UBI3 encode hybrid proteins in which 
ubiquitin is fused to unrelated (tail) amino acid sequences (Figure 
6). As is the case with ubiquitin itself, the tail amino acid 
sequences are conserved to a remarkably high degree over great 
evolutionary distances (Figure 8). 

In addition to being basic and containing putative nuclear 
localization signals (see Results), the tails of UBI1, UBI2 and 
UBI3 proteins each contain a region that matches a generalized 
consensus sequence of the so-called TFIIIA motif (Figure 9). This 
sequence motif, which includes several specifically arranged Cys 
or His residues, had been originally identified in the Xenopus 
transcription factor TFIIIA, where it was found repeated nine 
times (Miller et al., 1985). The corresponding nine domains of 
TFIIIA each contain a coordinated Zn2+ ion, and have been 
proposed to recognize ~5 bp of DNA per domain (Rhodes and 
Klug, 1986). A subsequent search for homologous sequences in 
other proteins has shown that the TFIIIA-like, potential 
metal-binding domains occur in a number of proteins, many of which 
have been implicated in nucleic acid binding (reviewed by Berg, 
1986; Harrison, 1986; Vincent, 1986). 

The tails of the UBI1-UBI3 proteins may exist in vivo either 
exclusively within the ubiquitin-containing fusion proteins (Figure 
6) or in addition as free tail proteins released by cleavage at the 
ubiquitin-tail junction. Release of the tail proteins in vivo is 
likely, given the recently identified substrate requirements of a 
ubiquitin-specific processing protease (Bachmair et al., 1986), 
and the fact that at least one of the UBI1-UBI3 proteins must 
be processed in vivo to yield mature ubiquitin (see above and 
also Finley et al., 1987). 

Whether the function of the UBI1-UBI3 tails involves specific 
DNA binding is not known. This possibility is being addressed 
by experiments in which an in vitro-synthesized tail protein is 
tested for specific DNA binding using the electrophoretic 
approach of Hope and Struhl (1985). Since each tail is longer than 
its single TFIIIA-like motif, the putative DNA-binding site of 
the tail may actually encompass significantly more than ~5 bp, 
which is the approximate length of DNA recognized by a single 
TFIIIA domain (Rhodes and Klug, 1986). What might be the 
specific functions of the tail proteins? One possibility is that the 
tails function as ubiquitin-free, nucleic acid-binding proteins and 
act as regulators of gene expression. A free tail protein might 
also resemble ubiquitin in being able to form post-translational 
tail-protein conjugates with specific acceptor proteins. A third 
possibility is that each tail functions as part of the 
ubiquitin-containing fusion protein by binding to specific DNA sites in vivo, 
thereby greatly increasing the concentration of ubiquitin in the 
vicinity of such sites. Regulated proteolytic release of ubiquitin 
from the DNA-bound fusion protein might then result in 
ubiquitination events that would be limited to specific sites in the 
chromosomes. Finally, unprocessed ubiquitin-tail fusion 
proteins may function as regulators of ubiquitin-dependent protein 
degradation. This is suggested by the fact that, being 
ubiquitin-protein fusions, the UBI1-UBI3 proteins resemble 
post-translationally formed ubiquitin-protein conjugates, which 
are intermediates in ubiquitin-dependent protein degradation (for 
a recent discussion, see Bachmair et al., 1986). In this model, 
the ubiquitin-tail fusion protein would be bound but not degraded 
by the ubiquitin-dependent protease, and as a result the 
degradation of post-translational ubiquitin-protein conjugates would be 
competitively inhibited. 

The in vitro DNA-binding experiments described above, 
together with immunological studies using tail-specific antibodies, 
and a detailed deletion analysis of the UBI1-UBI3 genes, cur- 
rently under way, should reveal the functions of the tail proteins 
in yeast. The high degree of evolutionary conservation of the 
tail amino acid sequences suggests that the functions thus under- 
stood will be relevant in other eukaryotes as well. 

Materials and methods 

Isolation of DNA and RNA 

Yeast culture media were prepared essentially as described by Sherman et al. 
(1981). Yeast genomic DNA was isolated from spheroplasts as described by 
Winston et al. (1983). For preparation of total RNA, yeast cultures were grown 
in YPD medium at 23°C and harvested either at low density (exponential cultures) 
or 3 h after their apparent optical density at 600 nm reached a plateau value 
(stationary cultures). The cells were centrifuged at 3000 g for 5 min, resuspended 
in 20 ml of 4.2 M guanidine thiocyanate at 5-10 x 10^8 cells/ml, immediately 
mixed with ~3 ml of 0.5-mm glass beads, and vortexed at the highest setting 
for 4 min. The lysate was centrifuged at 8000 g for 5 min, and RNA was purified 
from the supernatant by centrifugation through a cushion of 5.7 M CsCl, 
essentially as described by Chirgwin et al. (1979). 

Hybridization analysis 

Purified yeast DNA was digested with restriction endonucleases as recommended 
by their suppliers, electrophoresed in agarose gels in TAE buffer (Maniatis 
et al., 1982), and transferred to GeneScreen filters (New England Nuclear) by 
the method of Reed and Mann (1985). The filters were air-dried, heated at 80°C 
for 1 h, and irradiated with 254 nm u.v. light (20 s at ~1.2 mW/cm^2) to 
cross-link DNA to the filter (Church and Gilbert, 1984). Hybridization was carried 
out in lucite cylinders rotating in a roller bottle apparatus at ~0.5 rev/min for 
~15 h at 35°C. The hybridization solution contained 30% formamide, 4.5% 
SDS, 0.34 M NaCl, 1 mM Na-EDTA, 10 mg/ml bovine serum albumin, 0.16 M 
Na-phosphate (pH 7.0). For Northern hybridization, purified RNA samples were 
treated with glyoxal and electrophoresed in 1% agarose gels as described by 
Carmichael and McMaster (1980). After electrophoresis, the gel was incubated on 
a rocker platform in 50 mM NaOH for 20 min, then rinsed with water and 
further incubated in 0.2 M Na-acetate (pH 4.0) for 30 min. RNA was then transferred 
by blotting in 25 mM Na-phosphate (pH 7.0) onto a GeneScreen filter. 
Subsequent steps were identical to those described above for Southern hybridization 
except that the hybridization temperature was 42°C and the hybridization 
solution contained 45% formamide. For additional hybridizations using the same filter, 
the hybridized probe was removed by incubating the filter in 0.1% SDS, 1 mM 
Na-EDTA, 1 mM Tris-HCl (pH 7.5) at 75°C for 1 h. The two sets of 
hybridization conditions listed above yielded relatively low-stringency Southern hybridization 
and high-stringency Northern hybridization respectively. 
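
Why 30% formamide at 35°C is less stringent than 45% formamide at 42°C can be seen from a standard empirical approximation for the melting temperature of long DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(%formamide) - 500/L. The Python sketch below applies it to the two sets of conditions. This is an illustration under stated assumptions: the formula is a general textbook approximation and is not cited in this paper, it is only approximate for RNA-DNA hybrids, and the Na+ concentration, GC content and probe length used here are hypothetical placeholders.

```python
import math


def estimated_tm(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate Tm (deg C) of a long DNA duplex in formamide-containing buffer."""
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length_bp
    )


def margin_below_tm(hybridization_temp_c, **conditions):
    """Degrees by which the hybridization temperature lies below the estimated Tm.

    A smaller margin means higher stringency.
    """
    return estimated_tm(**conditions) - hybridization_temp_c


# Hypothetical shared parameters; only formamide and temperature differ.
shared = dict(na_molar=0.5, gc_percent=40.0, length_bp=300)

southern_margin = margin_below_tm(35.0, formamide_percent=30.0, **shared)
northern_margin = margin_below_tm(42.0, formamide_percent=45.0, **shared)

print(round(southern_margin - northern_margin, 2))
```

Because salt, GC content and length cancel in the difference, the Northern conditions sit roughly 16°C closer to the estimated Tm than the Southern conditions whatever placeholder values are chosen, which agrees with the low-stringency and high-stringency designations above.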

Cloning of S. cerevisiae ubiquitin genes 

UBI1-UBI3. Yeast genomic DNA from strain SUB 16, in which the 
polyubiquitin gene (UBI4) had been deleted (Finley et al., 1987), was digested with both 
HindIII and EcoRI, and electrophoresed in an agarose gel. DNA was eluted from 
gel slices containing either ~3.5-kb DNA fragments (including the UBI3 gene) 
or ~1.8-kb DNA fragments (including the UBI1 and UBI2 genes) (see also main 
text and Figure 1). A library was made from the ~3.5-kb DNA sample by ligating 
the DNA to HindIII/EcoRI-cut replicative form (RF) DNA of phage M13mp9 
(Messing, 1983). Two other libraries were made by ligating the ~1.8-kb DNA 
fragments either to HindIII/EcoRI-cut M13mp9 RF DNA or to HindIII-cut and 
dephosphorylated M13mp9 RF DNA using standard techniques (Maniatis et al., 
1982). The resulting libraries were transformed into Escherichia coli strain JM101 
(Messing, 1983), and the plaques were screened by hybridization at low stringency 
(Benton and Davis, 1977) with a yeast polyubiquitin (UBI4) gene probe (Ozkaynak 
et al., 1984; see also below) that had been labeled with 32P by the method of 
Feinberg and Vogelstein (1984). RF DNAs were prepared from purified positive 
plaques of each of the three libraries. The DNA inserts were examined by gel 
electrophoresis and Southern hybridization with the polyubiquitin (UBI4) gene 
probe. Since the initially obtained clones of UBI2 and UBI3 lacked the multiple 
cloning site of the M13mp9 vector that is required for application of the 
unidirectional deletion technique (see below), the corresponding DNA inserts were 
electrophoretically purified, made blunt-ended (Maniatis et al., 1982), and ligated to 
a HincII-cut M13mp9 RF DNA prior to sequencing. 

Sequencing of the initially obtained UBI1 and UBI2 clones showed that they 
lacked portions of their coding and flanking sequences due to internal HindIII 
cuts in genomic DNA used for cloning. To clone the complete UBI1 and UBI2 
genes, probes were prepared from unique, non-ubiquitin portions of the initial 
UBI1 and UBI2 clones. One probe (a ~1.4-kb HindIII/SstI fragment) spanned 
the 5' flanking region and intron of the UBI1 gene, and the other (a ~1.1-kb 
AccI/EcoRI fragment) spanned the 3' flanking region of the UBI2 gene. A plasmid 
Ycp50-based library of ~10-kb yeast genomic DNA inserts from strain S288C, 
carried in E. coli strain HB101, was obtained from Dr P.Novick (Yale 
University), screened with the above probes, and positive clones were purified. A 
UBI1-containing Ycp50 plasmid was cut within the UBI1 intron with SstI and 
at a 3' flanking site with BglII. The resulting ~3-kb fragment was blunt-ended 
and ligated to HincII-cut M13mp9 RF DNA for sequencing. A UBI2-containing 
Ycp50 plasmid was cut at 5' and 3' flanking sites of UBI2 with EcoRI and AccI, 
respectively, and the resulting ~1.2-kb fragment was subcloned as above into 
M13mp9 RF DNA for sequencing. 

UBI4. A yeast polyubiquitin (UBI4) gene probe derived from the originally 
obtained polyubiquitin clone carried in λgt11 (Ozkaynak et al., 1984) was used 
to screen the Ycp50-based genomic library described above. Plasmids from strongly 
positive colonies were purified and analysed by Southern hybridization. One 
plasmid thus obtained, pUB6, was digested with HindIII and the ~3-kb, 
UBI4-containing fragment was isolated for sequencing. 

DNA sequencing 

The unidirectional deletion procedure used to prepare the UBI1, UBI2 and UBI3 
clones for DNA sequencing by the chain termination method (Sanger et al., 1977) 
was essentially as described by Putney et al. (1981) and Yanisch-Perron et al. 
(1985). For sequencing of the UBI4 gene, nine overlapping DNA fragments were 
generated from the ~3-kb, UBI4-containing DNA fragment (see above) using 
appropriate restriction endonucleases and the previously available partial sequence 
of the UBI4 gene (Ozkaynak et al., 1984). The fragments were purified and 
subcloned into M13mp9 RF DNA for sequencing. Coding regions of UBI1-UBI3 
were sequenced on both strands. In addition, the sequence was determined at 
least twice in regions of initial ambiguity. 
A cDNA clone from the slime mold Dictyostelium discoideum that encodes a 
ubiquitin-tail protein strongly homologous to the UBI1/UBI2 protein of yeast has 
recently been described by M. Westphal, A. Müller-Taubenberger and G. Gerisch 
(1986) FEBS Lett., 209, 92-95. 